Summarizing with Encyclopedic Knowledge

Authors

  • Vivi Nastase
  • Katja Filippova
  • David N. Milne
Abstract

This paper presents a topic-driven multidocument summarization approach that relies on linking documents to Wikipedia. Wikipedia provides structural support to retrieve relevant concepts from the documents to be summarized and to quantify the strength of the relations between them, thus expanding the topic. We identify concepts in the documents and assign them scores that describe their relevance to the topic, their significance in general, and a machine-learned confidence that they should appear in the summary. Sentences are ranked according to the scores of the concepts within them and how much new information they provide. The best are extracted and compressed to form the summary. The system was trained and developed using the DUC 2005 and 2006 data. It was tested on the DUC 2007 data before being deployed on the update summarization task of TAC 2009. It ranked 5th among 30 peers in DUC 2007, and 21st of 52 peers on the TAC 2009 update task.
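The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sentence has already been reduced to a set of concept names, that each concept has a single precomputed score, and that "new information" means concepts not yet covered by an already selected sentence. The function name `rank_sentences`, the `novelty_discount` parameter, and the sample scores are all hypothetical.

```python
from typing import Dict, List, Set


def rank_sentences(
    sentences: List[Set[str]],
    concept_scores: Dict[str, float],
    budget: int,
    novelty_discount: float = 0.0,
) -> List[int]:
    """Greedily select up to `budget` sentence indices.

    A sentence's gain is the sum of its concept scores. A concept that an
    already selected sentence covers contributes only `novelty_discount`
    times its score (0.0 means repeated concepts earn no credit).
    This scoring rule is an assumption made for illustration.
    """
    covered: Set[str] = set()
    chosen: List[int] = []
    remaining = set(range(len(sentences)))

    def gain(i: int) -> float:
        return sum(
            concept_scores.get(c, 0.0) * (novelty_discount if c in covered else 1.0)
            for c in sentences[i]
        )

    while remaining and len(chosen) < budget:
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        covered |= sentences[best]
        remaining.remove(best)
    return chosen


# Hypothetical input: three sentences as concept sets, with made-up scores.
sentences = [
    {"wikipedia", "summarization"},
    {"wikipedia", "summarization", "topic"},
    {"compression"},
]
scores = {"wikipedia": 2.0, "summarization": 1.5, "topic": 1.0, "compression": 1.2}
picked = rank_sentences(sentences, scores, budget=2)
print(picked)  # [1, 2]
```

Sentence 1 is picked first because it has the highest total score. Sentence 0 then adds no uncovered concept, so sentence 2 is picked second despite its lower raw score.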

Similar resources

Summarizing Encyclopedic Term Descriptions on the Web

We are developing an automatic method to compile an encyclopedic corpus from the Web. In our previous work, paragraph-style descriptions for a term are extracted from Web pages and organized based on domains. However, these descriptions are independent and do not comprise a condensed text as in hand-crafted encyclopedias. To resolve this problem, we propose a summarization method, which produce...


Linking Educational Materials to Encyclopedic Knowledge

This paper describes a system that automatically links study materials to encyclopedic knowledge, and shows how the availability of such knowledge within easy reach of the learner can improve both the quality of the knowledge acquired and the time needed to obtain such knowledge.


"The sum of all human knowledge": A systematic review of scholarly research on the content of Wikipedia

Wikipedia might possibly be the best-developed attempt thus far of the enduring quest to gather all human knowledge in one place. Its accomplishments in this regard have made it an irresistible point of inquiry for researchers from various fields of knowledge. A decade of research has thrown light on many aspects of the Wikipedia community, its processes, and content. However, due to the variet...


MASAQ: A Multi-Agent System for Answering Questions Based on an Encyclopedic Knowledge Base

In this paper, we present a multi-agent system, called MASAQ, for answering users’ queries based on an encyclopedic knowledge base. MASAQ has three major components: (1) a natural language interface; (2) an executable specification language (EASL) for developing multi-agent systems for answering or reasoning about users’ queries; (3) an encyclopedic knowledge base covering twenty-one domains. I...


Combining Collocations, Lexical and Encyclopedic Knowledge for Metonymy Resolution

This paper presents a supervised method for resolving metonymies. We enhance a commonly used feature set with features extracted based on collocation information from corpora, generalized using lexical and encyclopedic knowledge to determine the preferred sense of the potentially metonymic word using methods from unsupervised word sense disambiguation. The methodology developed addresses one is...


Final Paper: Large-Scale Alignment of Encyclopedic Metabolic Networks

Encyclopedic metabolic networks capture the sum of human knowledge of biochemical reactions and substrates as found across a multitude of organisms, and not one particular organism. A method for large-scale alignment of two such encyclopedic metabolic networks, KEGG and MetaCyc, has been designed to allow for a systematic comparison of their contents. A variety of methods for matching reactions...


Publication date: 2009